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Abstract 

The  problem  of  1-bit  compressive  sampling  is  addressed  in  this  paper.  We  introduce  an 
optimization  model  for  reconstruction  of  sparse  signals  from  1-bit  measurements.  The  model 
targets  a  solution  that  has  the  least  ^o-norm  among  all  signals  satisfying  consistency  constraints 
stemming  from  the  1-bit  measurements.  An  algorithm  for  solving  the  model  is  developed. 
Convergence  analysis  of  the  algorithm  is  presented.  Our  approach  is  to  obtain  a  sequence 
of  optimization  problems  by  successively  approximating  the  £o"iiorm  and  to  solve  resulting 
problems  by  exploiting  the  proximity  operator.  We  examine  the  performance  of  our  proposed 
algorithm  and  compare  it  with  the  binary  iterative  hard  thresholding  (BIHT)  [11]  a  state-of- 
the-art  algorithm  for  1-bit  compressive  sampling  reconstruction.  Unlike  the  BIHT,  our  model 
and  algorithm  does  not  require  prior  knowledge  on  the  sparsity  of  the  signal.  This  makes  our 
proposed  work  a  promising  practical  approach  for  signal  acquisition. 


1  Introduction 

Compressive  sampling  is  a  recent  advance  in  signal  acquisition  [4,  5].  It  provides  a  method  to 
reconstruct  a  sparse  signal  x  £  from  linear  measurements 

y  =  (1) 

where  $  is  a  given  mx  n  measurement  matrix  with  m  <  n  and  y  £  is  the  measurement  vector 
acquired.  The  objective  of  compressive  sampling  is  to  deliver  an  approximation  to  x  from  y  and 
$.  It  has  been  demonstrated  that  the  sparse  signal  x  can  be  recovered  exactly  from  ?/  if  has 
Gaussian  i.i.d.  entries  and  satishes  the  restricted  isometry  property  [5].  Moreover,  this  sparse  signal 
can  be  identihed  as  a  vector  that  has  the  smallest  .^o-norm  among  all  vectors  yielding  the  same 
measurement  vector  y  under  the  measurement  matrix 

However,  the  success  of  the  reconstruction  of  this  sparse  signal  is  based  on  the  assumption 
that  the  measurements  have  infinite  bit  precision.  In  realistic  settings,  the  measurements  are  never 
exact  and  must  be  discretized  prior  to  further  signal  analysis.  In  practice,  these  measurements 
are  quantized,  a  mapping  from  a  continuous  real  value  to  a  discrete  value  over  some  finite  range. 
As  usual,  quantization  inevitably  introduces  errors  in  measurements.  The  problem  of  estimating 
a  sparse  signal  from  a  set  of  quantized  measurements  has  been  addressed  in  recent  literature. 
Surprisedly,  it  has  been  demonstrated  theoretically  and  numerically  that  1-bit  per  measurement  is 
enough  to  retain  information  for  sparse  signal  reconstruction.  As  pointed  out  in  [3,  11],  quantization 
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to  1-bit  measurements  is  appealing  in  practical  applications.  First,  1-bit  quantizers  are  extremely 
inexpensive  hardware  devices  that  test  values  above  or  below  zeros,  enabling  simple,  efficient, 
and  fast  quantization.  Second,  1-bit  quantizers  are  robust  to  a  number  of  non-linear  distortions 
applied  to  measurements.  Third,  1-bit  quantizers  do  not  suffer  from  dynamic  range  issues.  Due  to 
these  attractive  properties  of  1-bit  quantizers,  in  this  paper  we  will  develop  efficient  algorithms  for 
reconstruction  of  sparse  signals  from  1-bit  measurements. 

The  1-bit  compressive  sampling  framework  originally  introduced  in  [3]  is  briefly  described  as 
follows.  Formally,  it  can  be  written  as 


y  =  A{x)  :=  sign($x),  (2) 

where  the  function  sign(-)  denotes  the  sign  of  the  variable,  element-wise,  and  zero  values  are  assigned 
to  be  -|-1.  Thus,  the  measurement  operator  called  a  1-bit  scalar  quantizer,  is  a  mapping  from 
M”  to  the  Boolean  cube  B™'  :=  {—1, 1}™.  Note  that  the  scale  of  the  signal  has  been  lost  during 
the  quantization  process.  We  search  for  a  sparse  signal  x*  in  the  unit  ball  of  such  that  the 
sparse  signal  x*  is  consistent  with  our  knowledge  about  the  signal  and  measurement  process,  i.e., 
A{x*)  =  A{x). 

The  problem  of  reconstructing  a  sparse  signal  from  its  1-bit  measurements  is  generally  non- 
convex,  and  therefore  it  is  a  challenge  to  develop  an  algorithm  that  can  find  a  desired  solution. 
Nevertheless,  since  this  problem  was  introduced  in  [3]  in  2008,  there  are  several  algorithms  that  have 
been  developed  for  attacking  it  [3,  13,  19,  21].  Among  those  existing  1-bit  compressive  sampling 
algorithms,  the  binary  iterative  hard  thresholding  (BIHT)  [11]  exhibits  its  superior  performance  in 
both  reconstruction  error  and  as  well  as  consistency  via  numerical  simulations  over  the  algorithms  in 
[3,  13].  When  there  are  a  lot  of  sign  flips  in  the  measurements,  a  method  based  on  adaptive  outlier 
pursuit  for  1-bit  compressive  sampling  was  proposed  in  [21].  The  algorithms  in  [11,  21]  require 
the  sparsity  of  the  desired  signal  to  be  given  in  advance.  This  requirement,  however,  is  hardly 
satisfied  in  practice.  By  keeping  only  the  sign  of  the  measurements,  the  magnitude  of  the  signal 
is  lost.  The  models  associated  with  the  aforementioned  algorithms  seek  sparse  vectors  x  satisfying 
consistency  constraints  (2)  in  the  unit  sphere.  As  a  result,  these  models  are  essentially  non-convex 
and  non-smooth.  In  [19],  a  convex  minimization  problem  is  formulated  for  reconstruction  of  sparse 
signals  from  1-bit  measurements  and  is  solved  by  linear  programming.  The  details  of  the  above 
algorithms  will  be  briefly  reviewed  in  the  next  section. 

In  this  paper,  we  introduce  a  new  Iq  minimization  model  over  a  convex  set  determined  by  con¬ 
sistency  constraints  for  I-bit  compressive  sampling  recovery  and  develop  an  algorithm  for  solving 
the  proposed  model.  Our  model  does  not  require  prior  knowledge  on  the  sparsity  of  the  signal, 
therefore,  is  referred  to  as  the  blind  1-bit  compressive  sampling  model.  Our  approach  for  dealing 
with  our  proposed  model  is  to  obtain  a  sequence  of  optimization  problems  by  successively  approx¬ 
imating  the  £o-iiorm  and  to  solve  resulting  problems  by  exploiting  the  proximity  operator  [17]. 
Convergence  analysis  of  our  algorithm  is  presented. 

This  paper  is  organized  as  follows.  In  Section  2  we  review  and  comment  current  I-bit  com¬ 
pressive  sampling  models  and  then  introduce  our  own  model  by  assimilating  advantages  of  existing 
models.  Heuristics  for  solving  the  proposed  model  are  discussed  in  Section  3.  Convergence  analysis 
of  the  algorithm  for  the  model  is  studied  in  Section  4.  A  numerical  implementable  algorithm  for  the 
model  is  presented  in  Section  5.  The  performance  of  our  algorithm  is  demonstrated  and  compared 
with  the  BIHT  in  Section  6.  We  present  our  conclusion  in  Section  7. 
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2  Models  for  One-Bit  Compressive  Sampling 

In  this  section,  we  begin  with  reviewing  existing  modeis  for  reconstruction  of  sparse  signais  from 
1-bit  measurements.  After  anaiyzing  these  modeis,  we  propose  our  own  modei  that  assimiiates  the 
advantages  of  the  existing  ones. 

Using  matrix  notation,  the  1-bit  measurements  in  (2)  can  be  equivalently  expressed  as 

>  0,  (3) 


where 

Y  :=  diag(y) 

is  an  m  X  m  diagonal  matrix  whose  ith  diagonal  element  is  the  ilh.  entry  of  y.  The  expression 
y<hx  >  0  in  (3)  means  that  all  entries  of  the  vector  Y^x  are  no  less  than  0.  Hence,  we  can  treat 
the  1-bit  measurements  as  sign  constraints  that  should  be  enforced  in  the  construction  of  the  signal 
X  of  interest.  In  what  follows,  equation  (3)  is  referred  to  as  sign  constraint  or  consistency  condition, 
interchangeably. 

The  optimization  model  for  reconstruction  of  a  sparse  signal  from  1-bit  measurements  in  [3]  is 

min||x||i  s.t.  >  0  and  ||3:||2  =  1,  (4) 


where  ||  •  ||i  and  ||  •  ||2  denote  the  .^i-norm  and  ^2-iiorm  of  a  vector,  respectively.  In  model  (4),  the 
^i-norm  objective  function  is  used  to  favor  sparse  solutions,  the  sign  constraint  Y ^x  >  0  is  used  to 
impose  the  consistency  between  the  1-bit  measurements  and  the  solution,  the  constraint  ||a:||2  =  1 
ensures  a  nontrivial  solution  lying  on  the  unit  12  sphere. 

Instead  of  solving  model  (4)  directly,  a  relaxed  version  of  model  (4) 

min  I  Allx  111 -h  ^h((y4>x)i)|  s.t.  11x112  =  1  (5) 


was  proposed  in  [3]  and  solved  by  employing  a  variation  of  the  hxed  point  continuation  algorithm 
in  [10].  Here  A  is  a  regularization  parameter  and  h  is  chosen  to  be  the  one-sided  l\  (or  i2)  function, 
defined  at  2;  €  M  as  follows 


h{z) 


1^1  (or  if  2;  <  0; 

0,  otherwise. 


(6) 


We  remark  that  the  one-sided  £2  function  was  adopted  in  [3]  due  to  its  convexity  and  smoothness 
properties  that  are  required  by  a  fixed  point  continuation  algorithm. 

In  [13]  a  restricted-step-shrinkage  algorithm  was  proposed  for  solving  model  (4).  This  algorithm 
is  similar  in  sprit  to  trust-region  methods  for  nonconvex  optimization  on  the  unit  sphere  and  has 
a  provable  convergence  guarantees. 

Binary  iterative  hard  thresholding  (BIHT)  algorithms  were  recently  introduced  for  reconstruc¬ 
tion  of  sparse  signals  from  1-bit  measurements  in  [11].  The  BIHT  algorithms  are  developed  for 
solving  the  following  constrained  optimization  model 

m 

min  h{{Y ^x)i)  s.t.  ||3:||o  <  s  and  ||x||2  =  1,  (7) 

i=l 

where  h  is  defined  by  equation  (6),  s  is  a  positive  integer,  and  the  Iq-hovui  ||x||o  counts  the  number 
of  non-zero  entries  in  x.  Minimizing  the  objective  function  of  model  (7)  enforces  the  consistency 
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condition  (3).  The  BIHT  algorithms  for  model  (7)  are  a  simple  modification  of  the  iterative  thresh¬ 
olding  algorithm  proposed  in  [2].  It  was  shown  numerically  that  the  BIHT  algorithms  perform 
significantly  better  than  the  other  aforementioned  algorithms  in  [3,  13]  in  terms  of  both  recon¬ 
struction  error  as  well  as  consistency.  Numerical  experiments  in  [11]  further  show  that  the  BIHT 
algorithm  with  h  being  the  one-sided  function  performs  better  in  low  noise  scenarios  while  the 
BIHT  algorithm  with  h  being  the  one-sided  £2  function  perform  better  in  high  noise  scenarios. 
Recently,  a  robust  method  for  recovering  signals  from  1-bit  measurements  using  adaptive  outlier 
pursuit  was  proposed  for  the  measurements  having  noise  (i.e.,  sign  flips)  in  [21]. 

The  algorithms  reviewed  above  for  1-bit  compressive  sampling  are  developed  for  optimization 
problems  having  convex  objective  functions  and  non-convex  constraints.  In  [19]  a  convex  opti¬ 
mization  program  for  reconstruction  of  sparse  signals  from  1-bit  measurements  was  introduced  as 
follows: 

minllxlli  s.t.  Y^x>0  and  ||$x||i=p,  (8) 

where  p  is  any  fixed  positive  number.  The  first  constraint  Y >  0  requires  that  a  solution  to  model 
(8)  should  be  consistent  with  the  1-bit  measurements.  If  a  vector  x  satisfies  the  first  constraint,  so  is 
ax  for  all  0  <  a  <  1.  Hence,  an  algorithm  for  minimizing  the  .^i-norm  by  only  requiring  consistency 
with  the  measurements  will  yield  the  solution  x  being  zero.  The  second  constraint  ||<hx||i  =  pis  then 
used  to  prevent  model  (8)  from  returning  a  zero  solution,  thus,  resolves  the  amplitude  ambiguity. 
By  taking  the  first  constraint  into  consideration,  we  know  that  ||<I’a;||i  =  {y,^x),  therefore,  the 
second  constraint  is  x)  =  p  which  is  linear.  This  confirms  that  both  objective  function  and 

constraints  of  model  (8)  are  convex.  It  was  further  pointed  out  in  [19]  that  model  (8)  can  be  cast 
as  a  linear  program.  As  comparing  model  (8)  with  model  (4),  both  the  constraint  ||x||2  =  1  in 
model  (4)  and  the  constraint  ||4*a:||i  =  p  in  model  (8),  the  only  difference  between  both  models, 
enforce  a  non-trivial  solution.  However,  as  we  have  already  seen,  model  (8)  with  the  constraint 
||<hx||i  =  p  can  be  solved  by  a  computationally  tractable  algorithm. 

Let  us  further  comment  on  models  (7)  and  (8).  First,  the  sparsity  constraint  in  model  (7)  is 
impractical  since  the  sparsity  of  the  underlying  signal  is  unknown  in  general.  Therefore,  instead  of 
imposing  this  sparse  constraint,  we  consider  to  minimize  an  optimization  model  having  the  £o-i^orm 
as  its  objective  function.  Second,  although  model  (8)  can  be  tackled  by  efficient  linear  programming 
solvers  and  the  solution  of  model  (8)  preserves  the  effective  sparsity  of  the  underlying  signal  (see 
[19]),  the  solution  is  not  necessarily  sparse  in  general  as  shown  in  our  numerical  experiments  (see 
Section  6).  Motivated  by  the  aforementioned  models  and  the  associated  algorithms,  we  plan  in  this 
paper  to  reconstruct  sparse  signals  from  1-bit  measurements  via  solving  the  following  constrained 
optimization  model 

minllxllo  s.t.  >  0  and  ||4>x||i=p,  (9) 

where  p  is  again  a  arbitrary  positive  number.  This  model  has  the  ^o-i^orm  as  its  objective  function 
and  inequality  Y^x  >  0  and  equality  ||$x||i  =  p  as  its  convex  constraints. 

We  remark  that  the  actual  value  of  p  is  not  important  as  long  as  it  is  positive.  More  precisely, 
suppose  that  S  and  are  two  sets  collecting  all  solutions  of  model  (9)  with  p  =  1  and  p  =  p*  >  0, 
respectively.  If  x  G  5,  that  is,  Y^x  >  0  and  ||4’a;||i  =  1,  then,  by  denoting  x^  :=  p*x,  it  can  be 
verified  that  ||a;*||o  =  II^^Ho;  Y^x^  >  0,  and  =  p^.  That  indicates  x^  G  S^.  Therefore,  we 

have  that  p^S  C  .  Conversely,  we  can  show  that  C  p^S  by  reverting  above  steps.  Hence, 
p^S  =  .  Without  loss  of  generality,  the  positive  number  p  is  always  assumed  to  be  1  in  the  rest 

part  of  the  paper. 

To  close  this  section,  we  compare  model  (7)  and  our  proposed  model  (9)  in  the  following  result. 

Proposition  2.1  Let  y  G  he  the  1-hit  measurements  from  an  m  x  n  measurement  matrix  $ 
via  equation  (2)  and  let  s  he  a  positive  integer.  Assume  that  the  vector  x  G  M”  is  a  solution  to 
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model  (9).  Then  model  (7)  has  the  unit  vector  as  its  solution  i/||x||o  <  s;  otherwise,  model  (7) 
can  not  have  a  solution  satisfying  the  consistency  constraint  i/  ||x||o  >  s. 

Proof:  Since  the  vector  x  is  a  solution  to  model  (9),  then  x  satishes  the  consistency  constraint 
Y^x  >  0.  Hence,  it,  together  with  dehnition  of  h  in  (6),  implies  that 


E'- 


=  0. 


=  1.  Hence,  the  vector  is  a  solution  of 
’  IfI|2 

model  (7)  if  ||x||o  <  s. 

On  the  other  hand,  if  ||x||o  >  s  then  all  solutions  to  model  (7)  do  not  satisfy  the  consistency 
constraint.  Suppose  this  statement  is  false.  That  is,  there  exists  a  solution  of  model  (7),  say  x^, 
such  that  >  0,  ||x^||o  <  s,  and  ||x^||2  =  1  hold.  Set  x^  :=  ||^^-  Then  ||x^||o  =  ||a^^||o  <  s, 

y<hx*  >  0,  and  ||<hx*||i  =  1.  Since  ||x*||o  <  ||a;||o)  h  turns  out  that  x  is  not  a  solution  of  model  (9). 
This  contracts  our  assumption  on  the  vector  x.  This  completes  the  proof  of  the  result.  □ 

From  Proposition  2.1,  we  can  see  that  the  sparsity  s  for  model  (7)  is  critical.  If  s  is  set  too 
large,  a  solution  to  model  (7)  may  not  be  the  sparsest  solution  satisfying  the  consistency  constraint; 
if  s  is  set  too  small,  solutions  to  model  (7)  cannot  satisfy  the  consistency  constraint.  In  contrast, 
our  model  (9)  does  not  require  the  sparsity  constraint  used  in  model  (7)  and  delivers  the  sparsest 
solution  satisfying  the  consistency  constraint.  Therefore,  these  properties  make  our  model  more 
attractive  for  1-bit  compressive  sampling  than  the  BIHT.  Since  sparsity  of  the  underlying  signal  is 
not  specihed  in  advance  in  model  (9),  we  refer  it  to  as  blind  1-bit  compressive  sampling  model. 


We  further  note  that 


lFl|2 


=  ||x||o  and 


3  An  Algorithm  for  the  Blind  1-Bit  Compressive  Sampling 

In  this  section,  we  will  develop  algorithms  for  the  proposed  model  (9).  We  first  reformulate 
model  (9)  as  an  unconstrained  optimization  problem  via  the  indicator  function  of  a  closed  convex 
set  in  It  turns  out  that  the  objective  function  of  this  unconstrained  optimization  problem 

is  the  sum  of  the  .^o-i^orm  and  the  indicator  function  composing  with  a  matrix  associated  with 
the  1-bit  measurements.  Instead  of  directly  solving  the  unconstrained  optimization  problem  we 
use  some  smooth  concave  functions  to  approximate  the  .^o-norm  and  then  linearize  the  concave 
functions.  The  resulting  model  can  be  viewed  as  an  optimization  problem  of  minimizing  a  weighted 
^i-norm  over  the  closed  convex  set.  The  solution  of  this  resulting  model  is  served  as  a  new  point  at 
which  the  concave  functions  will  be  linearized.  This  process  is  repeatedly  performed  until  a  certain 
stopping  criteria  is  met.  Several  concrete  examples  for  approximating  the  ^g-norm  are  provided  at 
the  end  of  this  section. 

We  begin  with  introducing  our  notation  and  recalling  some  background  from  convex  analysis. 
For  the  d-dimensional  Euclidean  space  the  class  of  all  lower  semicontinuous  convex  functions 
f  :  ^  (— oo,-|-oo]  such  that  dom/  :=  {x  G  :  /(x)  <  -|-oo}  7^  0  is  denoted  by  Fo(M'^).  The 

indicator  function  of  a  closed  convex  set  C  in  is  defined,  at  n  G  as 

.  .  f  0,  if  u  G  C; 
ic{u)  ■=  i  ,1  ■ 

[  -|-oo,  otherwise. 

Clearly,  lc  is  in  Fo(M'^)  for  any  closed  nonempty  convex  set  C. 
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Next,  we  reformulate  model  (9)  as  an  unconstrained  optimization  problem.  To  this  end,  from 
the  m  X  n  matrix  and  the  m-dimensional  vector  y  in  equation  (2),  we  define  an  (m  +  1)  x  n 
matrix 


B  := 


diag(y) 


(10) 


and  a  subset  of 


C  :=  {z  :  z  €  Zm+i  =  1  and  Zi  >  0,  i  =  1,2, . . .  ,  mj,  (11) 

respectively.  Then  a  vector  x  satisfies  the  two  constraints  of  model  (9)  if  and  only  if  the  vector  Bx 
lies  in  the  set  C.  Hence,  model  (9)  can  be  rewritten  as 

min{||a;||o  +  lc{Bx)  :  x  G  R"'}.  (12) 

Problem  (12)  is  known  to  be  NP-complete  due  to  the  non-convexity  of  the  ^o-aorm.  Thus,  there 
is  a  need  for  an  algorithm  that  can  pick  the  sparsest  vector  x  satisfying  the  relation  Bx  G  C.  To 
attack  this  .^o-norm  optimization  problem,  a  common  approach  that  appeared  in  recent  literature 
is  to  approximate  the  .^o-^orm  by  its  computationally  feasible  approximations.  In  the  context  of 
compressed  sensing,  we  review  several  popular  choices  for  dehning  the  .^o-norm  as  the  limit  of  a 
sequence.  More  precisely,  for  a  positive  number  e  G  (0, 1),  we  consider  separable  concave  functions 
of  the  form 

n 

F,ix)-.=  Y,M\xi\),  xGR",  (13) 

i=l 

where  /g  :  R+  — )•  R  is  strictly  increasing,  concave,  and  twice  continuously  differentiable  such  that 

lim  F^(x)  =  ||x||o,  for  all  x  G  R"".  (14) 

£->■0-1- 

Since  the  function  is  concave  and  smooth  on  R+  :=  [0,oo),  it  can  be  majorized  by  a  simple 
function  formed  by  its  first-order  Taylor  series  expansion  at  a  arbitrary  point.  Therefore,  at  any 
point  V  G  R”  the  following  inequality  holds 

F,{x)  <  F,{v)  +  (VF,(|u|),  |x|  -  |u|)  (15) 

for  all  x  G  R"'  with  |x|  ^  luj.  Here,  for  a  vector  u,  we  use  |u|  to  denote  a  vector  such  that  each 
element  of  |ti|  is  the  absolute  value  of  the  corresponding  element  of  u.  Clearly,  when  v  is  close  enough 
to  X,  the  expression  on  the  right-hand  side  of  (15)  provides  a  reasonable  approximation  to  the  one 
on  its  left-hand  side.  Therefore,  it  is  considered  as  a  computationally  feasible  approximation  to  the 
£o-norm  of  x.  With  such  an  approximation,  a  simplified  problem  is  solved  and  its  solution  is  used 
to  formulate  a  simplihed  problem  which  is  closer  to  the  ideal  problem  (12).  This  process  is  then 
repeated  until  the  solutions  to  the  simplified  problems  become  stationary  or  meet  a  termination 
criteria.  This  procedure  is  summarized  in  Algorithm  1. 

The  constant  terms  Ff^dx^^^)  and  (VTe(|x^^^|),  appeared  in  the  optimization  problem  in 

Algorithm  1  can  be  ignored  because  they  are  irrelevant  to  the  optimization  problem.  Hence  the 
expression  for  in  Algorithm  1  can  be  simplihed  as 

g  argmin  {(VF,(lx(^)|),|x|)+ic(5x)  :xGR”}.  (16) 

Since  is  strictly  concave  and  increasing  on  R+,  /'  is  positive  on  R_|_.  Hence,  (VT(;(|x(^)|),  Ixj)  = 
EiLi  m  can  be  viewed  as  the  weighted  £i-norm  of  x  having  /e(|x|^^|)  as  its  ith  weight. 
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Algorithm  1  (Iterative  scheme  for  model  (12)) 

Initialization:  choose  e  €  (0, 1)  and  let  S  M"  be  an  initial  point, 
repeat  (A;  >  0) 

Step  1:  Compute 

3;(fc+i)  g  argmin  jFedx*'^^!)  +  (VF^dx*-^^!),  |x|  —  +  ic{Bx)  :  x  €  M"’|  . 

until  a  given  stopping  criteria  is  met 


Thus,  the  objective  function  of  the  above  optimization  problem  is  convex.  Details  for  finding  a 
solution  to  the  problem  will  be  presented  in  the  next  section. 

In  the  rest  of  this  section,  we  list  several  possible  choices  of  the  functions  in  (13)  including 
but  not  limited  to  the  Mangasarian  function  in  [15],  the  Geman-McClure  function  in  [9],  and  the 
Log-Det  function  in  [8]. 

The  Mangasarian  function  is  given  as  follows: 

n 

F,(x)  =  E  ('  -  «■'”■''■)  • 

i=l 

where  x  £  M”.  This  function  is  used  to  approximate  the  .^o-norm  to  obtain  minimum-support 
solutions  (that  is,  solutions  with  as  many  components  equal  to  zero  as  possible).  The  usefulness  of 
the  Mangasarian  function  was  demonstrated  in  finding  sparse  solutions  of  underdetermined  linear 
systems  (see  [12]). 

The  Geman-McClure  function  is  given  by 


Feix)  = 

i=l 


\xi\/e 
Xi\/e  +  1’ 


(18) 


where  x  £  MT'.  In  a  regularization  framework  for  recovering  an  image/signal,  its  penalty  term  is 
often  formed  by  applying  this  function  to  a  linear  transform  of  the  underlying  image/signal.  This 
results  in  strongly  homogeneous  zones  in  the  solution.  The  corresponding  recovery  problem  can  be 
recovered  from  noisy  data  and  preserved  intact  from  small  variations  of  the  data  [18]. 

The  Log-Det  function  is  defined  as 


Feix)  =  Y 

i=l 


log(|xd/e  -F  1) 
log(l/e) 


(19) 


where  x  £  M”.  Notice  that  ||x||o  is  equal  to  the  rank  of  the  diagonal  matrix  diag(x).  The  function 
Fe(x)  is  equal  to  (log(l/e))“^  log(det(diag(x)  -|-  el))  +  n,  the  logarithm  of  the  determinant  of  the 
matrix  diag(x)  -|-  el.  Hence,  it  was  named  as  the  Log-Det  heuristic  and  used  for  minimizing  the 
rank  of  a  positive  semidefinite  matrix  over  a  convex  set  in  [8].  Constant  terms  can  be  ignored  since 
they  will  not  affect  the  solution  of  the  optimization  problem  (16).  Hence  the  Log-Det  function  in 
in  (19)  can  be  replaced  by 

n 

Fe{x)  =  (20) 

i=l 

The  function  Fg  for  the  above  three  choices  are  plotted  in  Figure  3.1  for  n  =  1  and  e  being 
|,  and  We  can  see  that  for  a  fixed  e  £  (0, 1)  the  Mangasarian  function  is  the  one  which  is 
the  most  closest  to  the  ^o-norm. 
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(a)  Mangasarian 


(b)  Geman-McClure 


(c)  Log-Det 


Figure  3.1:  Plots  of  Fi,  Fi,  Fj_,  with  n  =  1  for  (a)  the  Mangasarian  function;  (b)  the 
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Geman-McClure  function;  and  (c)  the  Log-Det  function. 


We  point  it  ont  that  both  the  Mangasarian  function  and  the  Geman-McGlure  function  are 
bounded  by  1,  therefore,  are  non-coercive  while  the  Log-Det  fnnction  is  coercive.  This  makes 
difference  in  convergence  analysis  of  the  associated  Algorithm  1  that  will  be  presented  in  the  next 
section.  In  what  follows,  the  function  F^  is  the  Mangasarian  function,  the  Geman-McClnre,  or  the 
Log-Det  fnnction.  We  specify  it  only  when  it  is  noted. 

4  Convergence  Analysis 

In  this  section,  we  shall  give  convergence  analysis  for  Algorithm  1.  We  begin  with  presenting  the 
following  resnlt. 

Theorem  4.1  Given  e  €  (0, 1),  G  M”,  and  the  set  C  defined  by  (11),  let  the  sequence  : 
k  G  N}  he  generated  by  Algorithm  1,  where  N  is  the  set  of  all  natural  numbers.  Then  the  following 
three  statements  hold: 

(i)  The  sequence  :  A:  G  N}  converges  when  F^  is  corresponding  to  the  Mangasarian 

function  (17),  the  Geman-McClure  function  (18)  or  the  Log-Det  function  (20); 

(a)  The  sequence  :  A:  G  N}  is  bounded  when  Fg  is  the  Log-Det  function; 

is  convergent  when  the  sequence  {x^^^  :  k  G  N}  is  bounded. 

Proof:  We  hrst  prove  Item  (i).  The  key  step  for  proving  it  is  to  show  that  the  seqnence  {^^(x^^^)  : 
A;  G  N}  is  decreasing  and  bounded  below.  The  boundedness  of  the  seqnence  is  dne  to  the  fact  that 
Fe(0)  <  Fe(x^^)).  From  Step  1  of  Algorithm  1  or  equation  (16),  one  can  immediately  have  that 

tc(Fx(^+^))  =  0  and  (VF,(|x(^)|),  |x(*^+^)|)  <  (VF,(|x(^)|),  |x(^)|).  (21) 

By  identifying  x^^'i  and  ,  respectively,  as  v  and  x  in  (15)  and  nsing  the  inequality  in  (21), 

we  get  Fe(x(^+^))  <  Ffix^^'>).  Hence,  the  sequence  {^^(x^^))  :  A:  G  N}  is  decreasing  and  bonnded 
below.  Item  (i)  follows  immediately. 

When  Ff:  is  chosen  as  the  Log-Det  function,  the  coerciveness  of  together  with  Item  (i)  implies 
that  the  seqnence  {x^^^  :  A:  G  N}  must  be  bounded,  that  is,  Item  (ii)  holds. 


Finally,  we  prove  Item  (iii).  From  the  second-order  Taylor  expansion  of  the  function  at 
we  have  that 

(22) 

where  v  is  some  point  in  the  line  segment  linking  the  points  and  \x  W|  and  V^F^{v)  is  the 

Hessian  matrix  of  F).  at  the  point  v. 

By  (21),  the  second  term  on  the  right-hand  of  equation  (22)  is  less  than  0.  By  equation  (20), 
S/'^F^iy)  for  v  lying  in  the  hrst  octant  of  M”'  is  a  diagonal  matrix  and  is  equal  to 


r  ^1  -I 

e  e 

r  1  n 

r  1  n 

vn 

e  €  _ 

,  -2e 

1 

,  or  — 

1 

+ 

to 

1 _ 

which  corresponds  to  F^  being  the  Mangasarian,  the  Geman-McClure,  or  the  Log-Det  function. 
Hence,  the  matrix  S/'^F^{v)  is  negative  dehnite.  Since  the  sequence  :  k  £  N}  is  bounded,  there 
exists  a  constant  p  >  0  such  that 


(fc+i) 


a.(fe)|)Tv2_p^(„)(|3;(fe+l) 


xik+1) 


Putting  all  above  results  together  into  (22),  we  have  that 


F,{x^^+^'>)  <  F,(x(^)) 


P 

2 


x{k+i) 


Summing  the  above  inequality  from  A:  =  1  to  -|-oo  and  using  Item  (i)  we  get  the  proof  of  Item  (iii). 

□ 


From  Item  (iii)  of  Theorem  4.1,  we  have  —  \x  Will 

2  — )•  0  as  /c  — >•  oo. 

To  further  study  properties  of  the  sequence  {x^^^  :  A:  £  N}  generated  by  Algorithm  1,  the  matrix 
is  required  to  have  the  range  space  property  (RSP)  which  is  originally  introduced  in  [22].  With 
this  property  and  motivated  by  the  work  in  [22]  we  prove  that  Algorithm  1  can  yield  a  sparse 
solution  for  model  (12). 

Prior  to  presenting  the  dehnition  of  the  RSP,  we  introduce  the  notation  to  be  used  throughout 
the  rest  of  this  paper.  Given  a  set  5  C  {1,  2, ...  ,  n},  the  symbol  jiSj  denotes  the  cardinality  of  S, 
and  :=  {1,  2, . . .  ,  n}  \  5  is  the  complement  of  S.  Recall  that  for  a  vector  u,  by  abuse  of  notation, 
we  also  use  juj  to  denote  the  vector  whose  elements  are  the  absolute  values  of  the  corresponding 
elements  of  u.  For  a  given  matrix  A  having  n  columns,  a  vector  u  in  M”,  and  a  set  S  C  {1,  2, . . .  ,  n}, 
we  use  the  notation  As  to  denote  the  submatrix  extracted  from  A  with  column  indices  in  S,  and 
us  the  subvector  extracted  from  u  with  component  indices  in  S. 

Definition  4.2  (Range  Space  Property  (RSP))  Let  A  he  anmxn  matrix.  Its  transpose  A'^ 
is  said  to  satisfy  the  range  space  property  (RSP)  of  order  K  with  a  constant  p  >  0  if  for  all  sets 
S  C  {1, . . .  ,n}  with  jS]  >  K  and  for  all  ^  in  the  range  space  of  A~^  the  following  inequality  holds 

||?5=l|i  <  pII'^sIIi- 


We  remark  that  if  the  transpose  of  an  m  x  n  matrix  B  has  the  RSP  of  order  K  with  a  constant 
p  >  0,  then  for  every  non-empty  set  S'  C  {1, . . .  ,n},  the  transpose  of  the  matrix  Bs,  denoted  by 
RJ,  has  the  RSP  of  order  K  with  constant  p  as  well. 
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The  next  result  shows  that  if  the  transpose  of  the  matrix  B  in  Algorithm  1  possesses  the  RSP, 
then  Algorithm  1  can  lead  to  a  sparse  solution  for  model  (12).  To  this  end,  we  define  a  mapping 
u  :  — )■  such  that  the  zth  component  of  the  vector  (t{u)  is  the  fth  largest  component  of  |n|. 

Proposition  4.3  Let  B  be  the  {m  +  1)  x  n  matrix  he  defined  by  (10)  and  let  :  k  G  N}  be  the 
sequence  generated  by  Algorithm  1.  Assume  that  the  matrix  B^  has  the  RSP  of  order  K  with  p  >  0 
satisfying  (1  +  p)K  <  n.  Suppose  that  the  sequence  :  /c  G  N}  is  bounded.  Then  {a{x^^'^))n  the 
nth  largest  component  of  x^^^  converges  to  0. 

Proof:  Suppose  this  proposition  is  false.  Then  there  exist  a  constant  7  >  0  and  a  subsequence 
{xi^A  :  j  G  N}  such  that  ((t(x^^-’)))„  >  27  >  0  for  all  j  G  N.  From  Item  (hi)  of  Theorem  4.1  we 
have  that 

(u(x('=^+i)))„  >  7  (23) 

for  all  sufficient  large  j.  For  simplicity,  we  set  ~  VFe(|x^^^^|).  Hence,  by  inequality  (23)  and 
Fg,  we  know  that 

>  0,  and  y^^A  ^  q  (24) 

for  all  sufficient  large  j.  In  what  follows,  we  assume  that  the  integer  j  is  large  enough  such  that 
the  above  inequalities  in  (24)  hold. 

Since  the  vector  ig  obtained  through  Step  1  of  Algorithm  I,  i.e.,  equation  (16),  then  by 

Fermat’s  rule  and  the  chain  rule  of  sub  differential  we  have  that 

0  =  diag(2/(^^))a||  •  ||i(diag(y(^^))x(^^+^))  +  B^b<^’^^+A, 

where  G  dtciBx^^:>^Af  By  (24),  we  get 

(9||  •  ||i(diag(?/(^'’))x(*’-’'+^))  =  {(sign(xf-’^^^),sign(x2^'’^^^), . . .  ,sign(xn'’^^^))"^}. 

Thus 

where  =  B~^b^^i~^A  ig  in  the  range  of  B^ . 

Let  S  be  the  set  of  indices  corresponding  to  the  K  smallest  components  of  Hence, 

n—K  n 

i=l  i=n—K-\-l 

Since  B^  has  the  RSP  of  order  K  with  the  constant  p,  we  have  that  ||i  <  ||i- 

Therefore, 

n—K  n 

EGfe'^’’))-^/’  E  (<'(!/''■■’))<■  (25) 

i=l  i=n—K+l 

However,  by  the  definition  of  a,  we  have  that 

n—K  n 

E  (''(!'“■■’))<  a  ("  -  E 

i=l  i=n—K-\-l 

These  inequalities  together  with  the  condition  (1  +  p)iL  <  n  lead  to 

n—K  n 

E  (<'(!''*’’))■>#>  E 

i=l  i=n—K+l 
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which  contradicts  to  (25).  This  completes  the  proof  of  the  proposition. 


□ 


From  Proposition  4.3,  we  conclude  that  a  sparse  solution  is  guaranteed  via  Algorithm  1  if  the 
transpose  of  B  satisfies  the  RSP.  Next,  we  answer  how  sparse  this  solution  will  be.  To  this  end, 
we  introduce  some  notation  and  develop  a  technical  lemma.  For  a  vector  x  G  M'^,  we  denote  by 
r(x)  the  set  of  the  indices  of  non-zero  elements  of  x,  i.e.,  t{x)  :=  {i  ■.  Xi  ^  0}.  For  a  sequence 
:  k  G  N},  a  positive  number  //,  and  an  integer  k,  we  define  :=  {i  ■  \x?\  >  lA- 

Lemma  4.4  Let  B  be  the  (m  +  1)  x  n  matrix  defined  by  (10),  let  be  the  Log- Det  function  defined 
by  (20),  and  let  :  k  G  N}  be  the  sequence  generated  by  Algorithm  1.  Assume  that  the  matrix 
B^  has  the  RSP  of  order  K  with  p  >  0  satisfying  {1  p)K  <  n.  Lf  there  exist  p.  >  pen  such 
that  \Ifj_{x^^'l)\  >  K  for  all  sufficient  large  k,  then  there  exists  a  /c''  G  N  such  that  ||x^*^^||o  <  n  and 
.^-(^.(fc+i))  ^  all  p  >  p'f 

Proof:  Set  :=  VT^dx^^^l).  Since  is  a  solution  to  the  optimization  problem  (16),  then 

by  Fermat’s  rule  and  the  chain  rule  of  subdifferential  we  have  that 

0  G  diag(y(^))9||  •  ||i(diag(y(^))x('=+^)) 

where  g  Hence,  if  0,  we  have  that  =  \{B~^b^^~^^'l)i\. 

For  i  G  we  have  that  |xj^^|  >  p  and  ^  feiA  A:  G  N,  where 

fe  =  log(-  -|-  e).  Furthermore,  there  exist  a  k'  such  that  |x^'''^|  >  0  for  f  G  Lfj^{x^^'^)  and  k  >  k'  due 
to  Item  (hi)  in  Theorem  4.1.  Thus,  we  have  for  all  k  >  k' 

where  IF*  =  nhme_>.o+  fAh)  =  ^  ^  positive  number  dependent  on  p. 

Now,  we  are  ready  to  prove  ||x*'^dlo  <  n  for  all  k  >  k" .  By  Proposition  4.3,  we  have  that 
{a{x^^'i))n  — >•  0  when  k  — >•  -|-oo.  Therefore,  there  exists  an  integer  k”  >  k'  such  that  |/^(x(^^)|  >  K 
and  0  <  (j{x^^^))n  <  min{^— e,  p}  for  all  k  >  k" .  Let  io  be  the  index  such  that  |Xj-^  =  (iT(x(^”)))n. 

We  will  show  that  x]  =  0.  If  this  statement  is  not  true,  that  is,  x)  is  not  zero,  then 

|(B^6(^"+1)),„|  =  mxfP\)  =  - >  pW*.  (26) 

However,  since  zq  is  not  in  the  set  I^(x^^'''i)  and  B^  satisfies  the  RSP,  we  have  that 
|(rT5(P'+i)).j  <  ^  \{B'^b^^''+%\<p  Y 

which  contradicts  to  (26).  Hence,  we  have  that  x^^  =  0  and  |r(x(^”'’'^))|  <  n.  By  replacing  k" 

by  k  +  1  and  repeating  this  process  we  can  obtain  x)^  =  0  for  all  G  N.  Therefore,  ||x||o  <  n 

for  all  k  >  k" .  This  process  can  be  also  applied  to  other  components  satisfying  xf^  =  0.  Thus 
there  exists  a  k"  £  N  such  that  t{x^^^)  C  t{x^^''^)  for  all  k  >k" .  □ 

With  Lemma  4.4,  the  next  result  shows  that  when  the  transpose  of  B  satisfies  the  RSP  there 
exists  a  cluster  point  of  the  sequence  generated  by  Algorithm  1  that  is  sparse  and  satisfies  the 
consistency  condition. 
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Theorem  4.5  Let  B  he  the  {m  +  1)  x  re  matrix  defined  by  (10),  let  he  the  Log-Det  function 
defined  by  (20),  and  let  :  A:  G  N}  6e  the  sequence  generated  by  Algorithm  1.  Assume  that  the 
matrix  BA  has  the  RSP  of  order  K  with  p  >  0  satisfying  {l  +  p)K  <  re.  Then  there  is  a  subsequence 
j  converges  to  a  [(1  +  p)K\ -sparse  solution,  that  is  (cr(x^^J^))  — )•  0  as 

j  — )•  +00  and  e  — )•  0. 


* 

Proof:  Suppose  the  theorem  is  false.  Then  there  exist  p*,  for  any  0  <  e*  <  ^,  there  exist  a 
e  G  (0,e*)  and  k'  such  that  ((T(a:(^)))|^(i+p)j^_|_ij  >  p*  for  all  k  >  k'.  It  implies  that  for  all  k  >  k' 

>  [{l  +  p)K  +  l\  >  {l  +  p)K>K.  (27) 

By  Lemma  4.4,  there  exist  a  k”  >  k'  such  that  ||x*'^)||o  <  re  and  C  t{xA  ''>)  for  all  k>k". 

Let  S  =  t{xA'1).  Thus  x^g}  =  0  for  all  k  >  k" .  Therefore,  the  optimization  problem  (16)  for 
updating  can  be  reduced  to  the  following  one 

x^^^  G  argmin{((VFe(|x^^^|))5,re)  +  l{{Bs)u)  :  u  G  (28) 

If  |r(a:(^"))|  >  \I^*{xA')'j\^  from  (27)  we  have  (1  +  p)K  <  IS"!.  Thus  from  Lemma  4.4  and  BJ 
having  RSP  with  the  same  parameters,  there  exist  a  k'"  >  k"  such  that  t{x^^'1)  <  t(x  (fc"))  for  all 
k  >  k"' .  Therefore,  by  induction,  there  must  exist  a  k  such  that  for  all  k  >  k 


t{x^^))  =  V(x('=)),  r(x^)  C  r(x(^)). 


It  means  that  for  all  A;  >  all  the  nonzero  components  of  x^^'l  are  bounded  below  by  p* .  Therefore, 
for  any  k  >  k,  the  updating  equation  (16)  is  reduced  by  (28)  with  S  =  I^*{x^^'t).  From  Lemma  4.3 
we  get  [c(x^^^)]|5|  — >•  0  which  contradicts  with  >  p* .  Therefore,  we  get  this  theorem.  □ 


5  An  Implementation  of  Algorithm  1 

In  this  section,  we  describe  in  detail  an  implementation  of  Algorithm  1  and  show  how  to  select  the 
parameters  of  the  associated  algorithm. 

Solving  problem  (16)  is  the  main  issue  for  Algorithm  1.  A  general  model  related  to  (16)  is 

min{||rx||i  +  (/9(ila:)  :  X  G  M”},  (29) 

where  F  is  a  diagonal  matrix  with  positive  diagonal  elements  and  p  is  in  Fo(M™'+^).  In  particular, 
if  we  choose  F  =  VT^dx^^^l)  and  ip  =  ic,  where  x^l  is  a  vector  in  M”,  e  is  a  positive  number,  C 
is  given  by  (11),  and  is  a  function  given  by  (13),  then  model  (29)  reduces  to  the  optimization 
problem  in  Algorithm  1. 

We  solve  model  (29)  by  using  recently  developed  first-order  primal-dual  algorithm  (see,  e.g., 
[6,  14]).  To  present  this  algorithm,  we  need  two  concepts  in  convex  analysis,  namely,  the  proximity 
operator  and  conjugate  function.  The  proximity  operator  was  introduced  in  [16,  17].  For  a  function 
/  G  Fo(M'^),  the  proximity  operator  of  /  with  parameter  A,  denoted  by  prox^j^^^,  is  a  mapping  from 
to  itself,  defined  for  a  given  point  x  G  by 

prox_,^y(x)  :=  argmin  I  ^11^  “  ^lli  +  /(^)  ■  u 
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Algorithm  2  PD-subroutine  (The  first-order  primal-dual  algorithm  for  solving  (29)) 

Input:  the  {m  -|-  1)  x  n  matrix  B  defined  by  (10);  two  positive  numbers  a  and  (3  satisfying  the 
relation  a(3  <  n^jp;  the  n  x  n  diagonal  matrix  T  with  all  diagonal  elements  positive;  and  the 
function  (p  G  ro(M"'). 

Initialization:  i  =  0  and  an  initial  guess  {u  G  X  x  M” 

repeat  (i  >0) 

Step  1:  Compute 

=  prox„||.||^„r  (x*  -  aB^ {2u^  - 

Step  2:  Compute 

=  prox^(^*(tt®  -|-  /3i?x*^^) 

Step  3:  Set  z  :=  f  -|-  1. 

until  a  given  stopping  criteria  is  met  and  the  corresponding  vectors  u^,  and  x®^^  are 

denoted  by  rt^®®®’,  zz®®®®®®,  and  x®®®®®®,  respectively. 

Output:  (zz®®®®®',u®®®®®®®,x®®®®®®')  =  PB{a,P,B,r,ip,u-^,u°,x°) 


The  conjugate  of  /  G  ro(M®*)  is  the  function  f*  G  ro(M®*)  defined  at  z  G  by 

f*{z)  :=  sup{(x,  z)  -  f{x)  :  x  G  M®^}. 

With  these  notation,  the  first-order  primal-dual  (PD)  method  for  solving  (29)  is  summarized  in 
Algorithm  2  (referred  to  as  PD-subroutine). 

Theorem  5.1  Let  B  be  an  {m  +  1)  x  n  matrix  defined  by  (10),  let  C  be  the  set  given  by  (11),  let 
a  and  (5  be  two  positive  numbers,  and  let  L  be  a  positive  such  that  L  >  ||i?|p,  where  ||il||  is  the 
largest  singular  value  of  B.  If 

afiL  <  1, 

then  for  any  arbitrary  initial  vector  (x“^,x®,zz®)  G  M®®  x  M®®  x  M®®®+^,  the  sequence  {x^  :  /c  G  N} 
generated  by  Algorithm  2  converges  to  a  solution  of  model  (29). 


The  proof  of  Theorem  5.1  follows  immediately  from  Theorem  1  in  [6]  or  Theorem  3.5  in  [14].  We 
skip  its  proof  here. 

Both  proximity  operators  proxQ,||.||^Qp  and  prox^,^*  should  be  computed  easily  and  efficiently 
in  order  to  make  the  iterative  scheme  in  Algorithm  2  numerically  efficient.  Indeed,  the  proximity 
operator  prox„||.||^op  is  given  at  2  G  M®®  as  follows:  for  j  =  1, 2, . . .  ,  n 

(prox„||.||ior(2:))^.  =  max{|zj|  -  07^,0}  •  sign(2:j),  (30) 


where  'jj  is  the  jih.  diagonal  element  of  P.  Using  the  well-known  Moreau  decomposition  (see,  e.g. 

[1,  16]) 

T 


prox^,^*  =1-13  prox  1  ^  o 


(31) 


we  can  compute  the  proximity  operator  prox^^^*  via  proxi^  which  depends  on  a  particular  form  of 
the  function  p.  As  our  purpose  is  to  develop  algorithms  for  the  optimization  problem  in  Algorithm  1, 
we  need  to  compute  the  proximity  operator  of  which  is  given  in  the  following. 
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Lemma  5.2  If  C  is  the  set  given  by  (11)  and  P  is  a  positive  number,  then  for  z  G  we  have 


that 

PrOX^,*(2;)  =  {zi  -  {zi)  +  ,Z2  -  iz2)  +  ,  (Zm)  +  ,2m+l  “  /?),  (32) 

where  (s)+  is  s  if  s  >  0  and  0  otherwise. 

Proof:  We  first  give  an  explicit  form  for  the  proximity  operator  proxj_^^.  Note  that  ic  =  ^ic  for 
/3  >  0  and  ic{z)  =  t{i}(zm+i)  +  YllLi  i[o,oo)(^j))  for  ^  Hence,  we  have  that 

proxi  (2:)  =  ((2;i)+,(2;2)+,...,(2;m)+,l),  (33) 

/3  I' 

where  (s)+  is  s  if  s  >  0  and  0  otherwise.  Here  we  use  the  facts  that  prox^^^  (s)  =  (s)+  and 
prox^j^j(s)  =  1  for  any  s  G  M. 

By  the  Moreau  decomposition  (31),  we  have  that  prox^^.  (2)  =  z  — /3proxi^^(^2;).  This  together 
with  equation  (33)  yields  (32).  □ 


Next,  we  comment  on  the  diagonal  matrix  T  in  model  (29).  When  the  function  (p  in  model  (29) 
is  chosen  to  be  lq,  then  the  relation  ap  =  p  holds  for  any  positive  number  a.  Hence,  by  rescaling 
the  diagonal  matrix  T  in  model  (29)  with  any  positive  number,  that  does  not  alter  the  solutions  of 
model  (29) .  Therefore,  we  can  assume  that  the  largest  diagonal  entry  of  T  is  always  equal  to  one. 

In  applications  of  Theorem  5.1  as  in  Algorithm  2,  we  should  make  the  product  of  a  and  /3  as 
close  to  l/||H|p  as  possible.  In  our  numerical  simulations,  we  always  set 


0.999 

~W¥' 


(34) 


In  such  the  way,  /3  is  essentially  the  only  parameter  that  needs  to  be  determined. 

Prior  to  computing  a  for  a  given  /3  by  equation  (34),  we  need  to  know  the  norm  of  the  matrix  B. 
When  min{m,  n}  is  small,  the  norm  of  the  matrix  B  can  be  computed  directly.  When  min{m,  n}  is 
large,  an  upper  bound  of  the  norm  of  the  matrix  B  is  estimated  in  terms  of  the  size  of  B  as  follows. 


Proposition  5.3  Let  be  an  m  x  n  matrix  with  i.i.d.  standard  Gaussian  entries  and  y  be  an 
m-dimensional  vector  with  its  component  being  +1  or  —1.  We  define  an  (m  + 1)  x  n  matrix  B  from 
$  and  y  via  equation  (10).  Then 

]E{||H||}  <  Vrri^n{y/n  +  y/m). 

Moreover, 

||H||  <  \/m  +  l{\fn  +  ^/m  +  t) 
holds  with  probability  at  least  1  —  for  all  t  >  0. 


Proof:  By  the  structure  of  the  matrix  B  in  (10),  we  know  that 


B\\  < 


diag(y) 


Therefore,  we  just  need  to  compute  the  norms  on  the  right-hand  side  of  the  above  inequality. 
Denote  by  Im  the  mxm  identity  matrix  and  Im  the  vector  with  all  its  components  being  1.  Then 


diag(2/) 


[diag(?/)  y]  = 


|T 


Im 

m 
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(a)  (m,  n)  =  (500, 1000) 


(b)  (m,n)  =  (1000,1000) 


(c)  (m,n)  =  (1500,1000) 


Figure  5.2:  Plots  (the  dotted  blue  lines)  of  the  norms  of  the  (m  +  1)  x  n  matrices  B  .  The  heights 
of  the  red  lines  are  computed  by  the  formula  y/m  +  l(\/n  +  y/m). 


which  is  a  special  arrow-head  matrix  and  has  m  +  1  as  its  largest  eigenvalue  (see  [20]).  Hence, 

y 


Furthermore,  by  using  random  matrix  theory  for  the  matrix  we  know  that  E{||$||}  <  y/n  +  y/m 
and  ||d>||  <  y/n  +  y/m  +  t  with  probability  at  least  1  —  2e“*  for  all  t  >  0  (see,  e.g.,  [7]).  This 
completes  the  proof  of  this  proposition.  □ 


The  dotted  blue  lines  in  Figure  5.2(a),  (b),  and  (c)  depict  the  norm  of  B  for  100  randomly 
generated  matrices  and  vectors  y  for  the  pair  {m,n)  with  three  different  choices  (500,1000), 
(1000,1000),  and  (1500,1000),  respectively.  Corresponding  to  these  choices,  the  mean  values  of 
||il||  are  about  815,  1276,  and  1711  while  the  upper  bounds  of  the  expected  values  of  ||il||  by 
Proposition  5.3  are  about  1208,  2001,  and  2726,  respectively.  We  can  see  that  the  norm  of  B  varies 
with  its  size  and  turns  to  be  a  big  number  when  the  value  of  min{m,  n}  is  relatively  large.  As 
a  consequence,  the  parameter  a  or  /  must  be  very  small  relative  to  the  other  by  equation  (34). 
Therefore,  in  what  follows,  the  used  matrix  B  in  model  (29)  is  considered  to  have  been  rescaled  in 
the  following  way: 


B  B 

— TT  or  - - 

B\\  y/m  +  l{y/n  +  y/m) 


(35) 


when  the  norm  of  B  can  be  computed  easily  or  not. 

The  complete  procedure  for  model  (12)  and  how  the  PD-subroutine  is  employed  are  summarized 
in  Algorithm  3. 


6  Numerical  Simulations 

In  this  section,  we  demonstrate  the  performance  of  Algorithm  3  for  1-bit  compressive  sampling 
reconstruction  in  terms  of  accuracy  and  consistency  and  compare  it  with  the  BIHT  algorithm. 

Through  this  section,  all  random  mxn  matrices  $  and  length-n,  s-sparse  vectors  x  are  generated 
based  on  the  following  assumption:  entries  of  and  x  on  its  support  are  i.i.d.  Gaussian  random 
variables  with  zero  mean  and  unit  variances.  The  locations  of  the  nonzero  entries  (i.e.,  the  support) 
of  X  are  randomly  permuted.  We  then  generate  the  1-bit  observation  vector  y  by  equation  (2). 
We  obtain  reconstruction  of  x*  from  y  by  using  the  BIHT  and  Algorithm  3.  The  quality  of  the 
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Algorithm  3  (Iterative  scheme  for  model  (12)) 

Input:  the  (m  +  1)  x  n  matrix  B  formed  by  an  m  x  n  matrix  $  and  an  m-dimensional  vector 
y  via  (10);  the  set  C  given  by  (11);  e  €  (0, 1),  and  r  >  0;  Omax  and  Cmin  be  two  real  numbers; 
the  maximum  iteration  number  femax- 

Initialization:  normalizing  B  according  to  (35);  T  being  the  n  x  n  identity  matrix;  an  initial 
guess  G  x  x  M”;  and  initial  parameters  /3  and  a  =  0.999//3. 

while  k  <  /cmax  do 
Step  1:  Compute 

(^oi4+i  ^  ^  ^  ,  x(^)) 

Step  2:  Update  T  as  the  scaled  matrix  diag(VFe(x(^+^)))  such  that  the  largest  diagonal 
element  of  T  is  one. 

Step  3:  If  q;  <  Omax,  update  a  2q;,  /3  •(—  /3/2;  if  e  >  eminj  update  e  ■(—  re; 

Step  4:  Update  /c  ■(—  /c  +  1. 

end  while 
Output: 


reconstructed  x*  is  measured  in  terms  of  the  signal-to- noise  ratio  (SNR)  in  dB 


SNR(x,x*) 


201ogio 


The  accuracy  of  the  BIHT  and  Algorithm  3  is  measured  by  the  average  of  SNR  values  over  100 
trials  unless  otherwise  noted.  For  all  figures  in  this  section,  results  by  the  BIHT  and  Algorithm  3 
with  the  Mangasarian  function  (17),  the  Geman-McClure  function  (18)  and  the  Log-Det  function 
(20)  are  marked  by  the  symbols  “V”,  “o”,  and  respectively. 


6.1  Effects  of  using  inaccurate  sparsity  on  the  BIHT 

The  BIHT  requires  the  availability  of  the  sparsity  of  the  underlying  signals.  This  requirement  is, 
however,  not  known  in  practical  applications.  In  this  subsection,  we  demonstrate  through  numerical 
experiments  that  the  mismatched  sparsity  for  a  signal  will  degenerate  the  performance  of  the  BIHT. 

To  this  end,  we  fix  n  =  1000  and  s  =  10  and  consider  two  cases  of  m  being  500  and  1000.  For 
each  case,  we  vary  the  sparsity  input  for  the  BIHT  from  8  to  12  in  which  10  is  the  only  right  choice. 
Therefore,  there  are  total  ten  configurations.  For  each  configuration,  we  record  the  SNR  values  and 
the  numbers  of  sign  constraints  not  being  satisfied  of  the  reconstructed  signals  by  the  BIHT. 

Figure  6.3  depicts  the  SNR  values  of  the  experiments.  The  plots  in  the  top  row  of  Figure  6.3 
are  for  the  case  m  =  500  while  the  plots  in  the  bottom  row  are  for  the  case  m  =  1000.  The  marks 
in  each  plot  represent  the  pairs  of  the  SNR  values  with  a  mismatched  sparsity  input  (i.e.,  s  =  8, 
s  =  9,  s  =  11,  or  s  =  12  corresponding  to  the  column  1,  2,  3,  or  4)  and  with  the  correct  sparsity 
input  (i.e.,  s  =  10).  A  mark  below  the  red  line  indicates  that  the  BIHT  with  the  correct  sparsity 
input  works  better  than  the  one  with  an  incorrect  sparsity  input.  A  mark  that  is  far  away  from 
the  red  line  indicates  the  BIHT  with  the  correct  sparsity  input  works  much  better  than  the  one 
with  an  incorrect  sparsity  input  or  vice  versa.  Except  the  second  plot  in  the  top,  we  can  see  that 
the  BIHT  with  the  correct  sparsity  input  performs  better  than  the  one  with  an  inaccurate  sparsity 
input.  In  particular,  when  an  underestimated  sparsity  input  to  the  BIHT  is  used,  the  performance 
of  the  BIHT  will  be  significantly  reduced  (see  the  plots  in  the  first  two  columns  of  Figure  6.3). 
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Figure  6.3:  The  marks  in  the  plots  (from  the  left  column  to  the  right  column)  represent  the  pairs  of 
the  SNR  values  from  the  BIHT  with  the  correct  sparsity  input  (i.e.,  s  =  10)  and  incorrect  sparsity 
inputs  8,  9,  11,  and  12,  respectively.  We  fix  n  =  1000. 


(a)  (m,  n)  =  (500, 1000)  (b)  (m,  n)  =  (1000, 1000) 

Figure  6.4:  The  histograms  of  the  numbers  of  unsatisfied  consistency  conditions  over  200  trials. 


When  an  overestimated  sparsity  input  to  the  BIHT  is  used,  majority  marks  are  under  the  red  lines 
and  are  relatively  closer  to  the  red  lines  than  those  from  the  BIHT  with  underestimated  sparsity 
input.  We  further  report  that  the  average  SNR  values  for  the  sparsity  input  s  =  8,  9,  10,  11,  and 
12  for  m  =  500  are  21.89dB,  24.18dB,  23.25dB,  22.10dB,  and  21.00dB,  respectively.  Similarly,  for 
m  =  1000,  the  average  SNR  values  for  the  sparsity  input  s  =  8,  9,  10,  11,  and  12  are  19.77dB, 
26.37dB,  34.74dB,  31.12dB,  and  29.46dB,  respectively. 

Figure  6.4  (a)  and  (b)  illustrate  the  histograms  of  the  numbers  of  unsatished  consistency  condi¬ 
tions  over  200  trials  for  m  =  500  and  1000,  respectively.  We  can  see  from  Figure  6.4  (a)  that  the  use 
of  an  underestimated  sparsity  constraint  (s  =  8  or  9)  will  tend  to  yield,  on  average,  a  solution  with 
a  large  amount  of  sign  constraints  unsatisfied,  in  other  words,  under  the  current  setting  the  solu¬ 
tion  to  model  (7)  via  the  BIHT  does  not  satisfy  equation  (2).  As  expected,  when  an  overestimated 
sparsity  constraint  (s  =  11  or  12)  is  used,  the  sign  constraints  are  usually  satisfied. 

In  summary,  we  conclude  that  a  proper  chosen  sparsity  constraint  is  critical  for  the  success  of 
the  BIHT. 
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Figure  6.5:  Results  for  Plan-Vershynin’s  model  using  Linear  programming  over  100  trials. 

6.2  Plan-Vershynin’s  model  for  1-bit  reconstrnction 

Both  our  model  (9)  and  Plan-Vershynin’s  model  (8)  use  the  same  constraint  conditions.  Their 
objective  functions  are  different.  Our  model  uses  ^o-^orm  while  Plan-Vershynin’s  model  uses  .£1- 
norm.  As  suggested  in  [19],  linear  programming  can  be  applied  for  the  Plan-Vershynin  model.  We 
report  here  some  numerical  results  for  this  model. 

In  our  simulations,  we  fix  n  =  1000,  m  =  1000,  and  s  =  10.  All  simulations  were  performed  100 
trials.  Figure  6.5  illustrates  the  sparsity  of  the  reconstructions  of  all  trials  which  are  clearly  greater 
than  10  (indicated  by  the  solid  red  line  in  the  figure).  The  average  sparsity  of  the  reconstructions 
over  100  trials  is  23.42.  Recall  that  the  average  SNR  values  of  all  reconstructions  by  the  BIHT  is 
34.74dB. 

6.3  Performance  of  Algorithm  3 

Prior  to  applying  Algorithm  3  for  1-bit  compressive  sampling  problem,  parameters  fcmax,  t,  Omax, 
emin.  a,  and  e  in  Algorithm  3  need  to  be  determined.  Under  the  aforementioned  setting  for  the 
random  matrix  <I>  and  sparse  signal  x,  we  fix  fcmax  =  17,  r  =  |,  Omax  =  8000,  Cmin  =  10“^.  For 
the  functions  defined  by  (17),  (18),  and  (20),  we  set  the  pair  of  initial  parameters  {a,e)  as 
(500,0.25),  (500,0.125),  and  (250,0.125),  respectively.  The  iterative  process  in  the  PD-subroutine 
is  forced  to  stop  if  the  corresponding  number  of  iteration  exceeds  300.  These  parameters  are  used 
in  all  simulations  performed  by  Algorithm  3  in  the  rest  of  this  section. 

To  evaluate  the  performance  of  Algorithm  3  in  terms  of  SNR  values  at  various  scenarios,  we 
consider  three  configurations  for  the  size  of  the  random  matrix  $  and  the  sparsity  of  the  vector 
X.  In  the  first  configuration,  we  fix  n  =  1000  and  s  =  10  and  vary  m  such  that  the  ratio  m/n 
is  between  0.1  and  2.  In  the  second  configuration,  we  fix  m  =  1000  and  n  =  1000  and  vary  the 
sparsity  of  x  from  1  to  20.  In  the  third  configuration,  we  fix  m  =  1000  and  s  =  10  and  vary  n  from 
500  to  1400. 

For  every  cases  in  each  configuration,  we  compare  the  accuracy  of  Algorithm  3  with  the  BIHT 
by  computing  the  average  of  SNR  values  over  100  trials.  For  the  given  parameters  and  stopping 
criteria  adopted  by  Algorithm  3,  the  estimate  may  not  satisfy  the  consistency  condition  (3), 

that  is  the  signs  of  measurements  of  the  estimate  are  not  completely  consistent  with  that  of 

the  original  measurements.  Thus,  for  a  fair  comparison,  we  only  compute  the  average  of  SNR  values 
for  those  trials  that  both  reconstructions  from  the  BIHT  and  Algorithm  3  satisfy  the  consistency 
condition  (3)  and  we  say  the  corresponding  trials  are  valid. 

For  the  first  configuration,  the  SNR  values  in  decibels  of  the  average  reconstruction  errors 
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by  both  the  BIHT  and  Algorithm  3  are  depicted  in  Figure  6.6.  The  plots  demonstrate  that  our 
proposed  algorithm  performs  as  equally  good  as  the  BIHT,  in  particular,  when  m/n  is  greater  than 
1,  even  thought  our  algorithm  does  not  require  to  know  the  exact  sparsity  of  the  original  signal.  We 
can  see  that  Algorithm  3  with  the  Log-Det  function  (20)  (Figure  6.6(c))  performs  slightly  better 
than  with  the  Mangasarian  function  (17)  (Figure  6.6(a))  and  the  Geman-McClure  function  (18) 
(Figure  6.6(b)). 


BIHT 

-©-  Our  Algorithm 


BIHT 

— (—  Our  Algorithm 
1.5  2 


BIHT 

— Our  Algorithm 
1.5  2 


(a)  Mangasarian 


(b)  Geman-McClure 


(c)  Log-Det 


Figure  6.6:  Average  SNR  values  vs.  m/n  for  fixed  n  =  1000  and  s  =  10. 


Figure  6.7:  The  difference  of  SNR  values  of  estimates  by  Algorithm  3  and  the  BIHT  vs.  sparsity 
of  reconstructed  estimates  by  Algorithm  3.  We  fix  n  =  1000  and  s  =  10. 


Detailed  descriptions  for  valid  trials  for  m  =  200,  800,  1400,  and  2000  are  displayed  in  the 
columns  (from  left  to  right)  of  Figure  6.7,  respectively.  The  plots  in  the  rows  (from  the  top  to  the 
bottom)  of  Figure  6.7  illustrate  the  results  by  Algorithm  3  with  the  Mangasarian  function  (17),  the 
Geman-McGlure  function  (18),  and  the  Log-Det  function  (20),  respectively.  The  horizontal  axis  of 
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each  plot  represents  the  sparsity  of  the  reconstructed  signals  by  Algorithm  3  while  the  vertical  axis 
represents  the  difference  of  the  SNR  values  of  the  reconstructions  between  Algorithm  3  and  the 
BIHT.  Therefore,  the  marks  (“o”,  and  “*”)  above  the  dashed  horizontal  lines  indicate  that 

Algorithm  3  performs  better  than  the  BIHT  for  the  corresponding  trials.  Since  all  ideal  signals 
in  our  simulations  are  10-sparse,  the  marks  whose  horizontal  axis  are  bigger  than,  exactly  equal 
to,  or  smaller  than  10  imply  that  the  (.Q-noxva.  of  the  reconstructions  by  Algorithm  3  are  bigger 
than,  exactly  equal  to,  or  smaller  than  10,  respectively.  Thus,  the  ^o-norm  of  a  reconstruction 
over  10  indicates  that  the  reconstruction  is  not  a  global  minimizer  of  model  (9),  the  ^Q-r^oIUl  of  a 
reconstruction  being  10  indicates  that  the  sparsity  of  the  reconstruction  is  consistent  with  the  one  of 
the  original  test  signal,  the  ^o-norm  of  a  reconstruction  below  10  indicates  that  the  reconstruction  is 
potentially  a  global  minimizer  of  model  (9)  and  the  original  test  signal  is  not  a  solution  to  model  (9). 
We  can  conclude  from  Figure  6.7  that  (i)  the  reconstructions  by  Algorithm  3  with  sparsity  higher 
(res.  lower)  than  10  usually  have  lower  (res.  higher)  SNR  values  than  that  by  the  BIHT;  (ii) 
Increasing  m  (number  of  measurements)  tends  to  reduce  the  sparsity  of  the  reconstructions.  For 
example,  average  sparsity  of  the  reconstructions  for  m  =  200,  800,  1400,  and  2000  are,  respectively, 
11.74,  10.26,  10,  and  10.06  for  Algorithm  3  with  the  Mangasarian  function,  11.87,  10.53,  10.27,  and 
10  for  Algorithm  3  with  the  Geman-McClure  function,  and  11.53,  10.05,  9.88,  9.88  for  Algorithm  3 
with  the  Log-Det  function. 

For  the  second  configuration,  the  SNR  values  in  decibels  of  the  average  reconstruction  errors  by 
both  the  BIHT  and  Algorithm  3  are  compared  in  Figure  6.8  for  varying  sparsity  of  original  signals. 
The  plots  demonstrate  that  our  proposed  algorithm  performs  better  than  the  BIHT  for  sparsity  s 
being  2  and  6  to  10.  We  emphasize  again  that  unlike  the  BIHT  the  exact  sparsity  of  the  original 
signal  is  not  required  in  advance  by  Algorithm  3.  We  remark  that  when  s  =  1  both  the  BIHT 
and  Algorithm  3  find  an  exact  solution  to  model  (9).  This  phenomenon  was  also  reported  in  [13]. 
Detailed  descriptions  for  valid  trials  for  s  =  2,  8,  14,  and  20  are  displayed  in  the  columns  (from  left 
to  right)  of  Figure  6.9,  respectively.  The  marks  in  each  plot  of  Figure  6.9  have  the  same  meaning 
as  that  in  Figure  6.7.  For  fixed  m  =  1000  and  n  =  1000  we  can  draw  conclusions  from  Figure  6.9 
that  (i)  Algorithm  3  tends  to  produce  an  estimate  whose  sparsity  is  consistent  with  the  ideal  sparse 
signal;  (ii)  Algorithm  3  can  give  an  estimate  whose  sparsity  is  smaller  than  that  of  the  ideal  sparse 
signal,  in  particular,  when  the  sparsity  of  an  original  signal  is  relative  large. 


(a)  Mangasarian  (b)  Geman-McClure  (c)  Log-Det 


Figure  6.8:  Average  SNR  values  vs.  sparsity  of  original  signals  for  fixed  n  =  1000  and  m  =  1000. 

For  the  third  configuration,  the  SNR  values  in  decibels  of  the  average  reconstruction  errors  by 
both  the  BIHT  and  Algorithm  3  are  compared  in  Figure  6.10  for  fixed  m  =  1000  and  s  =  10  and 
varying  dimensions  of  original  signals.  The  plots  in  Figure  6.10  show  that  the  average  SNR  values 
for  reconstructions  by  Algorithm  3  are  lower  than  that  by  the  BIHT  in  most  cases.  This  is  due 
to  the  fact  that  the  BIHT  explores  an  unattainable  additional  information  on  the  sparsity  of  the 
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Figure  6.9:  The  difference  of  SNR  values  of  estimates  by  Algorithm  3  and  the  BIHT  vs.  sparsity 
of  reconstructed  estimates  by  Algorithm  3.  We  fix  n  =  1000  and  m  =  1000. 


original  signal.  Another  reason  which  we  can  see  from  Figure  6.11  is  that  reconstructions  with 
their  sparsity  larger  than  10  by  Algorithm  3  usually  have  lower  SNR  values  than  by  the  BIHT.  The 
marks  in  each  plot  of  Figure  6.11  have  the  same  meaning  as  that  in  Figures  6.7  and  6.9.  For  hxed 
m  =  1000  and  s  =  10  we  can  draw  conclusions  from  Figure  6.11  that  (i)  Algorithm  3  can  give  an 
estimate  whose  sparsity  is  smaller  than  that  of  the  ideal  sparse  signal  and  (ii)  Algorithm  3  with 
the  Log-Det  function  works  better  than  that  with  the  other  two. 


(a)  Mangasarian 


Figure  6.10:  Average  SNR  values  of  estimates  vs.  the  signal  size  n  for  fixed  m  =  1000  and  s  =  10. 


7  Summary  and  Conclusion 

In  this  paper  we  proposed  a  new  model  and  algorithm  for  1-bit  compressive  sensing.  Unlike  the 
state-of-the-art  BIHT  method,  our  model  does  not  need  to  know  the  sparsity  of  the  signal  of 


21 


Figure  6.11:  The  difference  of  SNR  values  of  estimates  by  Algorithm  3  and  the  BIHT  vs.  sparsity 
of  reconstructed  estimates  by  Algorithm  3.  We  hx  m  =  1000  and  s  =  10. 


interest.  We  demonstrated  the  performance  of  our  proposed  algorithm  for  reconstruction  from 
1-bit  measurements. 

It  would  be  of  interest  to  study  the  convergence  of  Algorithm  3  with  the  Mangasarian  function 
and  the  Geman-McClure  function  in  the  future.  It  would  be  highly  needed  to  adaptively  update  all 
parameters  in  Algorithm  3  so  that  consistent  reconstruction  can  be  always  achieved  with  improved 
accuracy  of  the  solution. 
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